Poznań
Appendix 1 A Spectral Analysis and LTI-SDE
The chain structure is also convenient for handling streaming data, as we will explain later. We first give a brief introduction to the EP and CEP framework. Step 2. We construct a tilted distribution to combine the true likelihood, … Step 3. We project the tilted distribution back to the exponential family, $q^* = \arg\min_q \mathrm{KL}(\tilde{p} \,\|\, q)$, where $q$ belongs to the exponential family. Step 4. We update the approximation terms in parallel and use damping to avoid divergence. The above computations are very convenient to implement.
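The projection and damping steps in the excerpt above can be illustrated with a minimal one-dimensional sketch. It assumes a Gaussian approximating family, for which minimizing the KL divergence from the tilted distribution reduces to matching its mean and variance. The function names, the grid-based quadrature, and the damping convention (a convex combination of old and proposed parameters) are illustrative assumptions, not details taken from the paper.

```python
import math


def project_to_gaussian(log_tilted, lo, hi, n=4001):
    """Moment-match an unnormalised 1-D density to a Gaussian.

    Within the Gaussian family, the minimiser of KL(tilted || q) has the
    same mean and variance as the tilted distribution. Here both moments
    are approximated by a Riemann sum on a uniform grid over [lo, hi]
    (an assumption made only to keep the sketch self-contained).
    """
    step = (hi - lo) / (n - 1)
    xs = [lo + i * step for i in range(n)]
    logs = [log_tilted(x) for x in xs]
    shift = max(logs)  # subtract the maximum for numerical stability
    ws = [math.exp(v - shift) for v in logs]
    z = sum(ws)
    mean = sum(w * x for w, x in zip(ws, xs)) / z
    var = sum(w * (x - mean) ** 2 for w, x in zip(ws, xs)) / z
    return mean, var


def damped_update(old_params, proposed_params, damping=0.5):
    """Move only part of the way from the old to the proposed parameters.

    damping=1.0 accepts the proposal fully; smaller values take shorter
    steps, a common way to keep parallel updates from diverging.
    """
    return tuple((1.0 - damping) * o + damping * p
                 for o, p in zip(old_params, proposed_params))


# Hypothetical tilted distribution: N(x; 0, 1) times a Gaussian
# likelihood N(1; x, 1), given as an unnormalised log-density.
def example_log_tilted(x):
    return -0.5 * x * x - 0.5 * (1.0 - x) ** 2


mean, var = project_to_gaussian(example_log_tilted, -10.0, 10.0)
# Damp in precision form: (precision * mean, precision).
old = (0.0, 1.0)
proposed = (mean / var, 1.0 / var)
updated = damped_update(old, proposed, damping=0.5)
```

Because the example likelihood is itself Gaussian, the projection is exact here; with a non-Gaussian likelihood the same moment-matching call gives the closest Gaussian in the KL sense.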
- Asia > China > Beijing > Beijing (0.04)
- Africa > Senegal > Kolda Region > Kolda (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (2 more...)
- North America > United States > Utah (0.05)
- Africa > Senegal > Kolda Region > Kolda (0.04)
- Asia > China > Beijing > Beijing (0.04)
- (3 more...)
Decoding Rewards in Competitive Games: Inverse Game Theory with Entropy Regularization
Liao, Junyi, Zhu, Zihan, Fang, Ethan, Yang, Zhuoran, Tarokh, Vahid
Estimating the unknown reward functions driving agents' behaviors is of central interest in inverse reinforcement learning and game theory. To tackle this problem, we develop a unified framework for reward function recovery in two-player zero-sum matrix games and Markov games with entropy regularization, where we aim to reconstruct the underlying reward functions given observed players' strategies and actions. This task is challenging due to the inherent ambiguity of inverse problems, the non-uniqueness of feasible rewards, and limited observational data coverage. To address these challenges, we establish the reward function's identifiability using the quantal response equilibrium (QRE) under linear assumptions. Building upon this theoretical foundation, we propose a novel algorithm to learn reward functions from observed actions. Our algorithm works in both static and dynamic settings and is adaptable to incorporate different methods, such as Maximum Likelihood Estimation (MLE). We provide strong theoretical guarantees for the reliability and sample efficiency of our algorithm. Further, we conduct extensive numerical studies to demonstrate the practical effectiveness of the proposed framework, offering new insights into decision-making in competitive environments.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > Middle East > Israel (0.04)
- North America > United States > Pennsylvania (0.04)
- (3 more...)
- Leisure & Entertainment > Games (1.00)
- Transportation (0.92)
- Information Technology (0.67)
More Than Bits: Multi-Envelope Double Binary Factorization for Extreme Quantization
Ichikawa, Yuma, Fujisawa, Yoshihiko, Fujimoto, Yudai, Sakai, Akira, Fujisawa, Katsuki
For extreme low-bit quantization of large language models (LLMs), Double Binary Factorization (DBF) is attractive as it enables efficient inference without sacrificing accuracy. However, the scaling parameters of DBF are too restrictive; after factoring out signs, all rank components share the same magnitude profile, resulting in performance saturation. We propose Multi-envelope DBF (MDBF), which retains a shared pair of 1-bit sign bases but replaces the single envelope with a rank-$l$ envelope. By sharing sign matrices among envelope components, MDBF effectively maintains a binary carrier and utilizes the limited memory budget for magnitude expressiveness. We also introduce a closed-form initialization and an alternating refinement method to optimize MDBF. Across the LLaMA and Qwen families, MDBF improves perplexity and zero-shot accuracy over previous binary formats at matched bits per weight while preserving the same deployment-friendly inference primitive.
- North America > United States (0.04)
- Europe > Poland > Greater Poland Province > Poznań (0.04)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
Amortized Causal Discovery with Prior-Fitted Networks
Sypniewski, Mateusz, Olko, Mateusz, Gajewski, Mateusz, Miłoś, Piotr
In recent years, differentiable penalized likelihood methods have gained popularity, optimizing the causal structure by maximizing its likelihood with respect to the data. However, recent research has shown that errors in likelihood estimation, even with relatively large sample sizes, prevent the discovery of the correct structures. We propose a new approach to amortized causal discovery that addresses the limitations of likelihood estimator accuracy. Our method leverages Prior-Fitted Networks (PFNs) to amortize data-dependent likelihood estimation, yielding more reliable scores for structure learning. Experiments on synthetic, simulated, and real-world datasets show significant gains in structure recovery compared to standard baselines. Furthermore, we demonstrate directly that PFNs provide more accurate likelihood estimates than conventional neural network-based approaches.
- Europe > Austria > Vienna (0.14)
- Europe > Poland > Masovia Province > Warsaw (0.05)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- (4 more...)
Comparing BFGS and OGR for Second-Order Optimization
Przybysz, Adrian, Kołek, Mikołaj, Sobota, Franciszek, Duda, Jarek
Across standard test functions and ablations with/without line search, OGR variants match or outperform BFGS in final objective and step efficiency, with particular gains in nonconvex landscapes where saddle handling matters. Exact Hessians (via AD) are used only as an oracle baseline to evaluate estimation quality, not to form steps.
II. Online Gradient Regression (OGR)
Online Gradient Regression (OGR) is a second-order optimization framework that accelerates stochastic gradient descent (SGD) by online least-squares regression of noisy gradients to infer local curvature and the distance to a stationary point [3]. The central assumption is that, in a small neighborhood, the objective $F(\theta)$ is well-approximated by a quadratic model, so the gradient varies approximately linearly with the parameters. OGR maintains exponentially weighted statistics of recent $(\theta_t, g_t)$ pairs and updates a local model each iteration at negligible extra cost compared to computing the gradient itself [2], [3].
A. Direct multivariate approach
At a given time $T$, based on recent gradients $g_t \in \mathbb{R}^d$ and positions $\theta_t \in \mathbb{R}^d$ for $t < T$, we would like to approximate the local behavior with a second-order polynomial using the parametrization $f(\theta) = h + \frac{1}{2}(\theta - p)^T H (\theta - p)$, $\nabla f = H(\theta - p)$, for Hessian $H \in \mathbb{R}^{d \times d}$ and $p \in \mathbb{R}^d$ the position of the saddle or extremum. For local behavior we will work on averages with weights $w_t$ that decrease exponentially for older observations, defining averages: …
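The regression idea in the excerpt above, fitting the linear gradient model to recent position and gradient pairs with exponentially decaying weights, can be sketched in one dimension. This is a simplified scalar illustration, not the paper's multivariate procedure: the excerpt is cut off before the weighted averages are defined, so the weight formula `beta ** age`, the function name `ogr_estimate_1d`, and the parameter `beta` are assumptions.

```python
def ogr_estimate_1d(thetas, grads, beta=0.9):
    """Estimate curvature H and stationary point p from (theta, g) pairs.

    Fits g ~ H * (theta - p) by weighted least squares. The newest pair
    has weight 1 and each older pair is down-weighted by a factor beta
    (an assumed weighting scheme for this sketch).
    """
    n = len(thetas)
    weights = [beta ** (n - 1 - t) for t in range(n)]
    total = sum(weights)

    # Weighted means of positions and gradients.
    mean_theta = sum(w * th for w, th in zip(weights, thetas)) / total
    mean_g = sum(w * g for w, g in zip(weights, grads)) / total

    # Weighted variance of positions and covariance with gradients.
    var_theta = sum(w * (th - mean_theta) ** 2
                    for w, th in zip(weights, thetas)) / total
    cov = sum(w * (th - mean_theta) * (g - mean_g)
              for w, th, g in zip(weights, thetas, grads)) / total

    if var_theta == 0.0 or cov == 0.0:
        raise ValueError("need distinct positions and nonzero curvature")

    H = cov / var_theta          # slope of the gradient = curvature
    p = mean_theta - mean_g / H  # point where the fitted gradient is zero
    return H, p


# Gradients of the quadratic f(theta) = 1 + 0.5 * 3 * (theta - 2)**2.
positions = [0.0, 0.5, 1.0, 1.5]
gradients = [3.0 * (th - 2.0) for th in positions]
H_est, p_est = ogr_estimate_1d(positions, gradients)
```

On noise-free gradients of an exact quadratic the fit recovers the curvature and the stationary point up to rounding; with noisy gradients the exponential weights trade responsiveness against variance. In $d$ dimensions the scalar division would be replaced by solving a linear system with the weighted covariance matrix of the positions.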
- Europe > Poland > Masovia Province > Warsaw (0.04)
- Europe > Poland > Lesser Poland Province > Kraków (0.04)
- Europe > Poland > Greater Poland Province > Poznań (0.04)
Modeling Retinal Ganglion Cells with Neural Differential Equations
Dobek, Kacper, Jankowski, Daniel, Krawiec, Krzysztof
This work explores Liquid Time-Constant Networks (LTCs) and Closed-form Continuous-time Networks (CfCs) for modeling retinal ganglion cell activity in tiger salamanders across three datasets. Compared to a convolutional baseline and an LSTM, both architectures achieved lower MAE, faster convergence, smaller model sizes, and favorable query times, though with slightly lower Pearson correlation. Their efficiency and adaptability make them well suited for scenarios with limited data and frequent retraining, such as edge deployments in vision prosthetics.
- Europe > Poland > Greater Poland Province > Poznań (0.05)
- North America > United States > Massachusetts > Middlesex County > Natick (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Oceania > Australia > New South Wales > Sydney (0.14)
- Europe > Poland > Greater Poland Province > Poznań (0.05)
- North America > Canada > Quebec > Montreal (0.05)
- (16 more...)
Cortex AISQL: A Production SQL Engine for Unstructured Data
Liskowski, Paweł, Han, Benjamin, Aggarwal, Paritosh, Chen, Bowei, Jiang, Boxin, Jindal, Nitish, Li, Zihan, Lin, Aaron, Schmaus, Kyle, Tayade, Jay, Zhao, Weicheng, Datta, Anupam, Wiegand, Nathan, Tsirogiannis, Dimitris
Snowflake's Cortex AISQL is a production SQL engine that integrates native semantic operations directly into SQL. This integration allows users to write declarative queries that combine relational operations with semantic reasoning, enabling them to query both structured and unstructured data effortlessly. However, making semantic operations efficient at production scale poses fundamental challenges. Semantic operations are more expensive than traditional SQL operations, possess distinct latency and throughput characteristics, and their cost and selectivity are unknown during query compilation. Furthermore, existing query engines are not designed to optimize semantic operations. The AISQL query execution engine addresses these challenges through three novel techniques informed by production deployment data from Snowflake customers. First, AI-aware query optimization treats AI inference cost as a first-class optimization objective, reasoning about large language model (LLM) cost directly during query planning to achieve 2-8$\times$ speedups. Second, adaptive model cascades reduce inference costs by routing most rows through a fast proxy model while escalating uncertain cases to a powerful oracle model, achieving 2-6$\times$ speedups while maintaining 90-95% of oracle model quality. Third, semantic join query rewriting lowers the quadratic time complexity of join operations to linear through reformulation as multi-label classification tasks, achieving 15-70$\times$ speedups with often improved prediction quality. AISQL is deployed in production at Snowflake, where it powers diverse customer workloads across analytics, search, and content understanding.
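The adaptive model cascade described above can be pictured with a small sketch: a cheap proxy labels every row, and only low-confidence rows are escalated to the expensive oracle. This is not Snowflake's implementation. The fixed confidence threshold, the function names, and the toy keyword-based models are all hypothetical, since the abstract does not say how uncertainty is measured.

```python
def cascade_classify(rows, proxy, oracle, threshold=0.9):
    """Label rows with a proxy model, escalating uncertain ones.

    proxy(row) returns (label, confidence in [0, 1]);
    oracle(row) returns a label and is assumed to be much more costly.
    Returns the labels and the number of oracle calls made.
    """
    labels = []
    oracle_calls = 0
    for row in rows:
        label, confidence = proxy(row)
        if confidence < threshold:
            # The proxy is unsure, so pay for the stronger model.
            label = oracle(row)
            oracle_calls += 1
        labels.append(label)
    return labels, oracle_calls


# Toy stand-ins for the two models (hypothetical, keyword based).
def toy_proxy(text):
    if "refund" in text:
        return True, 0.95
    if "thanks" in text:
        return False, 0.95
    return False, 0.5  # no strong signal: low confidence


def toy_oracle(text):
    return "refund" in text or "money back" in text


rows = ["please refund me", "thanks a lot", "I want my money back"]
labels, calls = cascade_classify(rows, toy_proxy, toy_oracle)
```

In this toy run only the third row lacks a strong proxy signal, so it is the only one sent to the oracle. Lowering `threshold` reduces oracle calls at the risk of accepting more proxy mistakes.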
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > California > San Mateo County > Menlo Park (0.05)
- North America > United States > New York > New York County > New York City (0.05)
- (5 more...)
- North America > United States (0.14)
- North America > Dominican Republic (0.04)
- Europe > Poland > Greater Poland Province > Poznań (0.04)
- (3 more...)
- Research Report (0.68)
- Overview (0.46)
- Summary/Review (0.46)